A very fast string matching algorithm for small alphabets and long patterns
Abstract
We are interested in the exact string matching problem, which consists of searching for all the occurrences of a pattern of length m in a text of length n. Both the pattern and the text are built over an alphabet of size σ. We present three versions of an exact string matching algorithm, all of which use a new shifting technique. The first version is straightforward and easy to implement. The second version is linear in the worst case, an improvement over the first. The main result is the third algorithm, which is very fast in practice for small alphabets and long patterns. Asymptotically, it performs O(log m (m + n/(m - log m))) inspections of text symbols in the average case. This compares favorably with many other string searching algorithms.
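For orientation only, the problem statement above can be illustrated with a naive baseline. The sketch below is not any of the three algorithms from the paper, whose shifting technique is not described in this abstract. It is a brute-force search, and the function name `find_all_occurrences` and the inspection counter are assumptions introduced for illustration. It reports every occurrence of the pattern and counts how many text symbols are inspected, the cost measure used in the abstract.

```python
def find_all_occurrences(pattern: str, text: str) -> tuple[list[int], int]:
    """Brute-force exact matching (illustrative baseline, not the paper's method).

    Returns the 0-based start positions of all occurrences of `pattern`
    in `text`, together with the number of text symbols inspected.
    """
    m, n = len(pattern), len(text)
    positions: list[int] = []
    inspections = 0

    # An empty pattern or a pattern longer than the text has no occurrences here.
    if m == 0 or m > n:
        return positions, inspections

    # Try every alignment of the pattern against the text, shifting by one.
    for start in range(n - m + 1):
        matched = 0
        while matched < m:
            inspections += 1  # one text symbol is read and compared
            if text[start + matched] != pattern[matched]:
                break
            matched += 1
        if matched == m:
            positions.append(start)

    return positions, inspections


if __name__ == "__main__":
    # Small alphabet {A, C, G, T}, as in DNA-like inputs.
    print(find_all_occurrences("ACA", "ACACAGACA"))
```

This baseline always shifts by one position and can inspect up to m(n - m + 1) text symbols in the worst case. Algorithms of the kind the abstract describes aim to inspect far fewer symbols on average by shifting the pattern further.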
Similar resources
Selecting the shortest superstring in DNA using the particle swarm optimization algorithm
A DNA string can be regarded as a very long string over a four-letter alphabet. Many scientists are attempting to decode this string. Because the string is very long, shorter sections of it that overlap one another are decoded instead. There is no information about the correct position of these sections on the main DNA string. It seems that the shortest string (substring of the main DNA string) i...
A GPGPU Implementation of Approximate String Matching with Regular Expression Operators and Comparison with Its FPGA Implementation
In this paper, we propose an efficient GPGPU implementation of an algorithm for approximate string matching with regular expression operators, originally implemented on an FPGA, and compare the GPGPU, FPGA and CPU implementations by experiments. Approximate string matching with regular expression operators is used in various applications, such as full text database search and DNA sequence analy...
Fast Packed String Matching for Short Patterns
Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, such as natural language processing, information retrieval and computational biology. In the last two decades a general trend has emerged of exploiting the power of the word RAM model to speed up the performance of classical string matching algorithms. In ...
View to String Matching Algorithms ? Ricardo
We present a unified view to sequential algorithms for many pattern matching problems, using a finite automaton built from the pattern which uses the text as input. We show the limitations of deterministic finite automata (DFA) and the advantages of using a bitwise simulation of non-deterministic finite automata (NFA). This approach gives very fast practical algorithms which have good complexity for ...
Towards a Very Fast Multiple String Matching Algorithm for Short Patterns
Multiple exact string matching is one of the fundamental problems in computer science and finds applications in many other fields, including computational biology and intrusion detection. It turns out that short patterns appear in many instances of such problems and, in most cases, significantly affect the performance of the algorithms. Recent solutions in the field of string matching try to expl...
Publication date: 1998